[FLINK-38531][cdc-connector-mysql]Fix data loss when restoring from a checkpoint positioned in the middle of a bulk DML operation. #4165
base: master
Please help review this PR. Thank you. @lvyanquan
@yuxiqian Would you like to help review this PR when you have time?
Hi @5herhom, thank you for pointing out this issue. Since
… checkpoint positioned in the middle of a bulk DML operation.
/**
 * In a bad case, it will skip the rest records whitch causes infinite wait for empty data. So
 * it should has a timeout to avoid it.
Typo: whitch => which.
Could you explain when a 'bad case' would happen?
Could you explain when a 'bad case' would happen?
This bad case occurred before this fix. Running this test case against the code from before the fix reproduces it.
    return Long.compare(restartSkipEvents, targetRestartSkipEvents);
}
// The completed events are the same, so compare the row number ...
return Long.compare(this.getRestartSkipRows(), that.getRestartSkipRows());
This looks reasonable to me.
Could @yuxiqian or @ruanhang1993 double-check this too?
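For readers following the review, the ordering discussed in the snippet above can be sketched in isolation. This is a minimal illustration, not the connector's actual `BinlogOffset` code: the class `SkipPosition` is hypothetical, and it assumes every earlier component of the offset (for example the GTID set) has already compared as equal. The field names mirror `restartSkipEvents` and `getRestartSkipRows()` from the diff.

```java
/**
 * Hypothetical stand-in for the tail of a binlog offset comparison.
 * Assumption: everything compared before these two fields is already equal.
 */
final class SkipPosition implements Comparable<SkipPosition> {
    // Number of events to skip after the restart point.
    private final long restartSkipEvents;
    // Number of rows to skip inside the event where reading resumes.
    private final long restartSkipRows;

    SkipPosition(long restartSkipEvents, long restartSkipRows) {
        this.restartSkipEvents = restartSkipEvents;
        this.restartSkipRows = restartSkipRows;
    }

    long getRestartSkipEvents() {
        return restartSkipEvents;
    }

    long getRestartSkipRows() {
        return restartSkipRows;
    }

    @Override
    public int compareTo(SkipPosition that) {
        // First compare how many events are skipped.
        if (this.restartSkipEvents != that.restartSkipEvents) {
            return Long.compare(this.restartSkipEvents, that.restartSkipEvents);
        }
        // The skipped events are the same, so compare the skipped rows.
        return Long.compare(this.getRestartSkipRows(), that.getRestartSkipRows());
    }
}
```

With this ordering, two positions inside the same event are distinguished by their row count, so `new SkipPosition(3, 40)` sorts before `new SkipPosition(3, 75)`, and both sort before `new SkipPosition(4, 0)`. Presumably this is what matters for a bulk DML statement, where one event can carry many rows, though the sketch does not model the reader itself.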
Details: FLINK-38531
Before the fix, the newly added cases BinlogOffsetTest#testCompareToWithGtidSetAndSkipEventsAndSkipRows and BinlogSplitReaderTest#testRestoreFromCheckpointWithGtidSetAndSkippingEventsAndRows do not pass.
Before the fix, the data loss can be reproduced with the newly added case BinlogSplitReaderTest#testRestoreFromCheckpointWithGtidSetAndSkippingEventsAndRows.